{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "grBmytrShbUE"
      },
      "source": [
        "# High-performance Simulation with Kubernetes\n",
        "\n",
        "This tutorial will describe how to set up high-performance simulation using a\n",
        "TFF runtime running on Kubernetes. The model is the same as in the previous\n",
        "tutorial, **High-performance simulations with TFF**. The only difference is that\n",
        "here we use a worker pool instead of a local executor.\n",
        "\n",
        "This tutorial refers to Google Cloud's [GKE](https://cloud.google.com/kubernetes-engine/) to create the Kubernetes cluster,\n",
        "but all the steps after the cluster is created can be used with any Kubernetes\n",
        "installation."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "SyXVaj0dknQw"
      },
      "source": [
        "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://www.tensorflow.org/federated/tutorials/high_performance_simulation_with_kubernetes\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs-l10n/blob/master/site/ja/federated/tutorials/high_performance_simulation_with_kubernetes.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://github.com/tensorflow/docs-l10n/blob/master/site/ja/federated/tutorials/high_performance_simulation_with_kubernetes.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
        "  </td>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yiq_MY4LopET"
      },
      "source": [
        "## GKE で TFF ワーカーを起動する\n",
        "\n",
        "> **注:** このチュートリアルは、ユーザーが既存の GCP プロジェクトを持っていることを前提としています。\n",
        "\n",
        "### Kubernetes クラスタの作成\n",
        "\n",
        "次の手順は一度だけ実行する必要があります。クラスタは、今後のワークロードに再利用できます。\n",
        "\n",
        "GKE の指示に従って、[コンテナクラスタを作成](https://cloud.google.com/kubernetes-engine/docs/tutorials/hello-app#step_4_create_a_container_cluster)します。このチュートリアルの以下の部分では、クラスタ名が` tff-cluster `であると想定していますが、実際の名前は重要ではありません。「*ステップ 5: アプリケーションのデプロイ*」に到達したら、指示に従うのをやめます。\n",
        "\n",
        "### TFF ワーカーアプリケーションをデプロイする\n",
        "\n",
        "GCP とやり取りするコマンドは、[ローカル](https://cloud.google.com/kubernetes-engine/docs/tutorials/hello-app#option_b_use_command-line_tools_locally)または[ Google Cloud Shell ](https://cloud.google.com/shell/)で実行できます。Google Cloud Shell では、追加の設定は必要ないため、Google Cloud Shell の使用をお勧めします。\n",
        "\n",
        "1. 次のコマンドを実行して、Kubernetes アプリケーションを起動します。\n",
        "\n",
        "```\n",
        "$ kubectl create deployment tff-workers --image=gcr.io/tensorflow-federated/remote-executor-service:{{version}}\n",
        "```\n",
        "\n",
        "1. アプリケーションのロードバランサを追加します。\n",
        "\n",
        "```\n",
        "$ kubectl expose deployment tff-workers --type=LoadBalancer --port 80 --target-port 8000\n",
        "```\n",
        "\n",
        "> **注:** これにより、デプロイメントがインターネットに公開されますが、これはデモのみのためです。運用環境では、ファイアウォールと認証を強くお勧めします。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "WK4ohHUZvVVc"
      },
      "source": [
        "Google Cloud Console でロードバランサの IP アドレスを検索します。後でトレーニングループをワーカーアプリに接続するために必要になります。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8Lq8r5uaT2rB"
      },
      "source": [
        "### (または) Docker コンテナをローカルで起動する\n",
        "\n",
        "```\n",
        "$ docker run --rm -p 8000:8000 gcr.io/tensorflow-federated/remote_executor_service:{{version}}\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_zFenI3IPpgI"
      },
      "source": [
        "## TFF 環境の設定"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ke7EyuvG0Zyn"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "-"
          ]
        }
      ],
      "source": [
        "#@test {\"skip\": true}\n",
        "!pip install --upgrade tensorflow_federated"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dFkcJZAojZDm"
      },
      "source": [
        "## トレーニングするモデルの定義"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "J0Qk0sCDZUQR"
      },
      "outputs": [],
      "source": [
        "import collections\n",
        "import time\n",
        "\n",
        "import tensorflow as tf\n",
        "import tensorflow_federated as tff\n",
        "\n",
        "source, _ = tff.simulation.datasets.emnist.load_data()\n",
        "\n",
        "\n",
        "def map_fn(example):\n",
        "  return collections.OrderedDict(\n",
        "      x=tf.reshape(example['pixels'], [-1, 784]), y=example['label'])\n",
        "\n",
        "\n",
        "def client_data(n):\n",
        "  ds = source.create_tf_dataset_for_client(source.client_ids[n])\n",
        "  return ds.repeat(10).batch(20).map(map_fn)\n",
        "\n",
        "\n",
        "train_data = [client_data(n) for n in range(10)]\n",
        "input_spec = train_data[0].element_spec\n",
        "\n",
        "\n",
        "def model_fn():\n",
        "  model = tf.keras.models.Sequential([\n",
        "      tf.keras.layers.Input(shape=(784,)),\n",
        "      tf.keras.layers.Dense(units=10, kernel_initializer='zeros'),\n",
        "      tf.keras.layers.Softmax(),\n",
        "  ])\n",
        "  return tff.learning.from_keras_model(\n",
        "      model,\n",
        "      input_spec=input_spec,\n",
        "      loss=tf.keras.losses.SparseCategoricalCrossentropy(),\n",
        "      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])\n",
        "\n",
        "\n",
        "trainer = tff.learning.build_federated_averaging_process(\n",
        "    model_fn, client_optimizer_fn=lambda: tf.keras.optimizers.SGD(0.02))\n",
        "\n",
        "\n",
        "def evaluate(num_rounds=10):\n",
        "  state = trainer.initialize()\n",
        "  for round in range(num_rounds):\n",
        "    t1 = time.time()\n",
        "    state, metrics = trainer.next(state, train_data)\n",
        "    t2 = time.time()\n",
        "    print('Round {}: loss {}, round time {}'.format(round, metrics.loss, t2 - t1))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "x5OhgAp7jrNI"
      },
      "source": [
        "## リモートエグゼキュータのセットアップ\n",
        "\n",
        "デフォルトでは、TFF はすべての計算をローカルで実行します。このステップでは、上で設定した Kubernetes サービスに接続するよう TFF に指示します。サービスの IP アドレスは必ずここにコピーします。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "sXSLXwcdciYm"
      },
      "outputs": [],
      "source": [
        "import grpc\n",
        "\n",
        "ip_address = '0.0.0.0'  #@param {type:\"string\"}\n",
        "port = 80  #@param {type:\"integer\"}\n",
        "\n",
        "client_ex = []\n",
        "for i in range(10):\n",
        "  channel = grpc.insecure_channel('{}:{}'.format(ip_address, port))\n",
        "  client_ex.append(tff.framework.RemoteExecutor(channel, rpc_mode='STREAMING'))\n",
        "\n",
        "factory = tff.framework.worker_pool_executor_factory(client_ex)\n",
        "context = tff.framework.ExecutionContext(factory)\n",
        "tff.framework.set_default_context(context)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bEgpmgSRktJY"
      },
      "source": [
        "## トレーニングの実行"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "mw92IA6_Zrud"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Round 0: loss 4.370407581329346, round time 4.201097726821899\n",
            "Round 1: loss 4.1407670974731445, round time 3.3283166885375977\n",
            "Round 2: loss 3.865147590637207, round time 3.098310947418213\n",
            "Round 3: loss 3.534019708633423, round time 3.1565616130828857\n",
            "Round 4: loss 3.272688388824463, round time 3.175067663192749\n",
            "Round 5: loss 2.935391664505005, round time 3.008434534072876\n",
            "Round 6: loss 2.7399251461029053, round time 3.31435227394104\n",
            "Round 7: loss 2.5054931640625, round time 3.4411356449127197\n",
            "Round 8: loss 2.290508985519409, round time 3.158798933029175\n",
            "Round 9: loss 2.1194536685943604, round time 3.1348156929016113\n"
          ]
        }
      ],
      "source": [
        "evaluate()"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [],
      "name": "high_performance_simulation_with_kubernetes.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
